Overview

Dataset statistics

Number of variables: 23
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.6 MiB
Average record size in memory: 586.2 B

Variable types

NUM: 13
CAT: 8
BOOL: 2

Reproduction

Analysis started: 2020-11-06 04:07:08.966736
Analysis finished: 2020-11-06 04:08:00.302105
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
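
The run time of the analysis follows from the two timestamps in the Reproduction block. A minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-11-06 04:07:08.966736", FMT)
finished = datetime.strptime("2020-11-06 04:08:00.302105", FMT)

# Elapsed wall-clock time of the profiling run, in seconds.
elapsed_seconds = (finished - started).total_seconds()
print(f"Profiling took {elapsed_seconds:.6f} s")
```

The run took a little over 51 seconds for 10000 observations and 23 variables.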

Warnings

emp_length has 250 (2.5%) zeros
delinq_2yrs has 8915 (89.1%) zeros
inq_last_6mths has 4607 (46.1%) zeros
mths_since_last_delinq has 6479 (64.8%) zeros
mths_since_last_record has 267 (2.7%) zeros
revol_bal has 278 (2.8%) zeros
revol_util has 254 (2.5%) zeros
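
Each warning above pairs a zero count with its share of the 10000 observations. The sketch below shows one way such a figure could be computed; `zero_summary` and `toy_column` are hypothetical names, and the toy data is not taken from the profiled dataset:

```python
def zero_summary(values):
    """Return (number of zeros, percentage of zeros) for a sequence of numbers."""
    n_zeros = sum(1 for v in values if v == 0)
    pct_zeros = 100.0 * n_zeros / len(values) if values else 0.0
    return n_zeros, pct_zeros


# Hypothetical toy column (not taken from the profiled dataset).
toy_column = [0.0, 0.6931471806, 0.0, 1.098612289, 2.397895273, 0.0, 1.609437912, 0.0]
n_zeros, pct_zeros = zero_summary(toy_column)
print(f"toy_column has {n_zeros} ({pct_zeros:.1f}%) zeros")

# The figures in the emp_length warning are consistent with each other:
# 250 zeros out of 10000 observations is 2.5%.
emp_length_pct = 100.0 * 250 / 10000
```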

Variables

is_bad
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
0 8705 87.1%
 
1 1295 13.0%
 

emp_length
Real number (ℝ≥0)

ZEROS
Distinct count: 11
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.561873741
Minimum: 0
Maximum: 2.397895273
Zeros: 250
Zeros (%): 2.5%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.6931471806
Q1: 1.098612289
median: 1.609437912
Q3: 2.197224577
95-th percentile: 2.397895273
Maximum: 2.397895273
Range: 2.397895273
Interquartile range (IQR): 1.098612289

Descriptive statistics

Standard deviation: 0.6738534717
Coefficient of variation (CV): 0.4314391451
Kurtosis: -1.064658501
Mean: 1.561873741
Median Absolute Deviation (MAD): 0.5850798667
Skewness: -0.2784256782
Sum: 15618.73741
Variance: 0.4540785013
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.34657359 1.49786614 1.86883481 2.35024018 2.39789527], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
2.397895273 2168 21.7%
 
0.6931471806 2083 20.8%
 
1.098612289 1183 11.8%
 
1.386294361 1010 10.1%
 
1.609437912 889 8.9%
 
1.791759469 779 7.8%
 
1.945910149 535 5.3%
 
2.079441542 421 4.2%
 
2.197224577 351 3.5%
 
2.302585093 331 3.3%
 
Value Count Frequency (%)
0 250 2.5%
 
0.6931471806 2083 20.8%
 
1.098612289 1183 11.8%
 
1.386294361 1010 10.1%
 
1.609437912 889 8.9%
 
Value Count Frequency (%)
2.397895273 2168 21.7%
 
2.302585093 331 3.3%
 
2.197224577 351 3.5%
 
2.079441542 421 4.2%
 
1.945910149 535 5.3%
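
The eleven distinct emp_length values listed above coincide with the natural logarithms of the integers 1 to 11 (for example ln 2 ≈ 0.6931 and ln 11 ≈ 2.3979). This suggests that the column was log-transformed before profiling, possibly as log(1 + x), but the report does not say so; treat that reading as an assumption. A quick check with the standard library:

```python
import math

# Distinct emp_length values as printed in the frequency tables above.
reported = [0.0, 0.6931471806, 1.098612289, 1.386294361, 1.609437912,
            1.791759469, 1.945910149, 2.079441542, 2.197224577,
            2.302585093, 2.397895273]

# Natural logarithms of the integers 1..11 for comparison.
candidates = [math.log(k) for k in range(1, 12)]

for k, (value, log_k) in enumerate(zip(reported, candidates), start=1):
    print(f"reported {value}  vs  ln({k}) = {log_k:.10f}")

# The report prints 10 significant digits, so allow a small absolute tolerance.
all_match = all(math.isclose(v, c, abs_tol=1e-9) for v, c in zip(reported, candidates))
```

The same values (0.6931471806, 1.098612289, and so on) recur in several other numeric columns of this report, which is consistent with the same reading.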
 

home_ownership
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
rent 4745 47.4%
 
mortgage 4445 44.5%
 
own 775 7.8%
 
other 34 0.3%
 
none 1 < 0.1%
 

Length

Max length: 8
Mean length: 5.7039
Min length: 3

Value Count Frequency (%)
Lowercase_Letter 10 100.0%

Value Count Frequency (%)
Latin 10 100.0%

Value Count Frequency (%)
ASCII 10 100.0%
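
The length statistics and the first character table for home_ownership can be recomputed from the value counts alone. The sketch below assumes that the character tables count distinct characters grouped by Unicode general category; the report does not state this, but the totals agree. All variable names are illustrative:

```python
import unicodedata
from collections import Counter

# Value counts for home_ownership, copied from the frequency table above.
value_counts = {"rent": 4745, "mortgage": 4445, "own": 775, "other": 34, "none": 1}

total = sum(value_counts.values())
lengths = [len(v) for v in value_counts]
max_length = max(lengths)
min_length = min(lengths)
# Mean length, weighted by how often each value occurs.
mean_length = sum(len(v) * c for v, c in value_counts.items()) / total

# Assumption: the character tables count *distinct* characters,
# grouped by Unicode general category ("Ll" is Lowercase_Letter).
distinct_chars = set("".join(value_counts))
category_counts = Counter(unicodedata.category(ch) for ch in distinct_chars)

print(max_length, mean_length, min_length)
print(dict(category_counts))
```

The ten distinct lowercase letters match the "Lowercase_Letter 10" row, and the weighted mean reproduces the reported mean length of 5.7039.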
 

annual_inc
Real number (ℝ≥0)

Distinct count: 1901
Unique (%): 19.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.96245435
Minimum: 7.601402335
Maximum: 13.71015115
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 7.601402335
5-th percentile: 10.07525073
Q1: 10.59665973
median: 10.96821553
Q3: 11.31448672
95-th percentile: 11.87427119
Maximum: 13.71015115
Range: 6.108748819
Interquartile range (IQR): 0.7178269885

Descriptive statistics

Standard deviation: 0.5693905148
Coefficient of variation (CV): 0.05194005798
Kurtosis: 1.159760395
Mean: 10.96245435
Median Absolute Deviation (MAD): 0.4387573392
Skewness: 0.01283349875
Sum: 109624.5435
Variance: 0.3242055584
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 7.60140233 8.3285876 9.13735675 9.38518907 9.40264576 ... 12.43912147 12.60313756 12.61486335 12.9479415 13.71015115], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
11.00211651 381 3.8%
 
10.81979828 267 2.7%
 
10.59665973 222 2.2%
 
11.22525673 213 2.1%
 
10.30898599 211 2.1%
 
11.08215793 204 2.0%
 
10.77897712 196 2.0%
 
11.15626481 193 1.9%
 
10.71443999 181 1.8%
 
11.28979441 170 1.7%
 
Other values (1891) 7762 77.6%
 
Value Count Frequency (%)
7.601402335 1 < 0.1%
 
8.314097335 1 < 0.1%
 
8.343077871 2 < 0.1%
 
8.476579509 2 < 0.1%
 
8.517393171 2 < 0.1%
 
Value Count Frequency (%)
13.71015115 2 < 0.1%
 
13.66468883 1 < 0.1%
 
13.56705048 1 < 0.1%
 
13.51979766 1 < 0.1%
 
13.49392831 1 < 0.1%
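
Several of the statistics reported for annual_inc are simple functions of the others, which gives a cheap consistency check. The sketch below recomputes them from the printed values, so agreement is only expected up to the rounding of the report; variable names are illustrative:

```python
import math

# Statistics for annual_inc, copied from the report above.
mean = 10.96245435
std = 0.5693905148
minimum, maximum = 7.601402335, 13.71015115
q1, q3 = 10.59665973, 11.31448672
n_observations = 10000

coefficient_of_variation = std / mean   # reported CV: 0.05194005798
variance = std ** 2                     # reported variance: 0.3242055584
value_range = maximum - minimum         # reported range: 6.108748819
iqr = q3 - q1                           # reported IQR: 0.7178269885
total = mean * n_observations           # reported sum: 109624.5435

for name, value in [("CV", coefficient_of_variation), ("variance", variance),
                    ("range", value_range), ("IQR", iqr), ("sum", total)]:
    print(f"{name}: {value:.10g}")
```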
 

verification_status
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
not verified 4367 43.7%
 
verified - income 3214 32.1%
 
verified - income source 2419 24.2%
 

Length

Max length: 24
Mean length: 16.5098
Min length: 12

Value Count Frequency (%)
Lowercase_Letter 13 86.7%
Space_Separator 1 6.7%
Dash_Punctuation 1 6.7%

Value Count Frequency (%)
Latin 13 86.7%
Common 2 13.3%

Value Count Frequency (%)
ASCII 15 100.0%
 

pymnt_plan
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
n 9998 > 99.9%
 
y 2 < 0.1%
 

purpose_cat
Categorical

Distinct count: 27
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
debt consolidation 4454 44.5%
 
credit card 1273 12.7%
 
other 1026 10.3%
 
home improvement 800 8.0%
 
major purchase 546 5.5%
 
small business 461 4.6%
 
car 349 3.5%
 
wedding 250 2.5%
 
medical 183 1.8%
 
moving 159 1.6%
 
Other values (17) 499 5.0%
 

Length

Max length: 33
Mean length: 13.9381
Min length: 3

Value Count Frequency (%)
Lowercase_Letter 21 95.5%
Space_Separator 1 4.5%

Value Count Frequency (%)
Latin 21 95.5%
Common 1 4.5%

Value Count Frequency (%)
ASCII 22 100.0%
 

addr_state
Categorical

Distinct count: 50
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
ca 1748 17.5%
 
ny 958 9.6%
 
fl 714 7.1%
 
tx 700 7.0%
 
nj 482 4.8%
 
va 392 3.9%
 
il 386 3.9%
 
pa 378 3.8%
 
ga 357 3.6%
 
ma 331 3.3%
 
Other values (40) 3554 35.5%
 

Length

Max length: 2
Mean length: 2
Min length: 2

Value Count Frequency (%)
Lowercase_Letter 24 100.0%

Value Count Frequency (%)
Latin 24 100.0%

Value Count Frequency (%)
ASCII 24 100.0%
 

debt_to_income
Real number (ℝ≥0)

Distinct count: 2585
Unique (%): 25.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.338704
Minimum: 0
Maximum: 29.99
Zeros: 58
Zeros (%): 0.6%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.129
Q1: 8.16
median: 13.41
Q3: 18.6925
95-th percentile: 23.93
Maximum: 29.99
Range: 29.99
Interquartile range (IQR): 10.5325

Descriptive statistics

Standard deviation: 6.754211507
Coefficient of variation (CV): 0.5063619004
Kurtosis: -0.8546793248
Mean: 13.338704
Median Absolute Deviation (MAD): 5.669516109
Skewness: -0.008777611376
Sum: 133387.04
Variance: 45.61937308
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.055 0.195 3.395 7.675 20.325 22.835 24.965 26.885 29.99 ], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 58 0.6%
 
12.48 16 0.2%
 
13.51 13 0.1%
 
10 13 0.1%
 
19.2 13 0.1%
 
18.14 13 0.1%
 
4.8 12 0.1%
 
17.82 12 0.1%
 
15.38 12 0.1%
 
22.43 12 0.1%
 
Other values (2575) 9826 98.3%
 
Value Count Frequency (%)
0 58 0.6%
 
0.11 1 < 0.1%
 
0.12 1 < 0.1%
 
0.13 1 < 0.1%
 
0.14 2 < 0.1%
 
Value Count Frequency (%)
29.99 1 < 0.1%
 
29.93 1 < 0.1%
 
29.92 1 < 0.1%
 
29.83 1 < 0.1%
 
29.74 1 < 0.1%
 

delinq_2yrs
Real number (ℝ≥0)

ZEROS
Distinct count: 10
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08913850185
Minimum: 0
Maximum: 2.48490665
Zeros: 8915
Zeros (%): 89.1%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0.6931471806
Maximum: 2.48490665
Range: 2.48490665
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2688248344
Coefficient of variation (CV): 3.015810552
Kurtosis: 10.21328108
Mean: 0.08913850185
Median Absolute Deviation (MAD): 0.1589339488
Skewness: 3.152153266
Sum: 891.3850185
Variance: 0.07226679162
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.34657359 0.89587973 1.24245332 1.49786614 1.86883481 2.48490665], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 8915 89.1%
 
0.6931471806 822 8.2%
 
1.098612289 186 1.9%
 
1.386294361 50 0.5%
 
1.609437912 14 0.1%
 
1.791759469 6 0.1%
 
1.945910149 3 < 0.1%
 
2.079441542 2 < 0.1%
 
2.197224577 1 < 0.1%
 
2.48490665 1 < 0.1%
 
Value Count Frequency (%)
0 8915 89.1%
 
0.6931471806 822 8.2%
 
1.098612289 186 1.9%
 
1.386294361 50 0.5%
 
1.609437912 14 0.1%
 
Value Count Frequency (%)
2.48490665 1 < 0.1%
 
2.197224577 1 < 0.1%
 
2.079441542 2 < 0.1%
 
1.945910149 3 < 0.1%
 
1.791759469 6 0.1%
 

inq_last_6mths
Real number (ℝ≥0)

ZEROS
Distinct count: 20
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5437069499
Minimum: 0
Maximum: 3.258096538
Zeros: 4607
Zeros (%): 46.1%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.6931471806
Q3: 1.098612289
95-th percentile: 1.609437912
Maximum: 3.258096538
Range: 3.258096538
Interquartile range (IQR): 1.098612289

Descriptive statistics

Standard deviation: 0.5739829645
Coefficient of variation (CV): 1.055684436
Kurtosis: -0.4749831683
Mean: 0.5437069499
Median Absolute Deviation (MAD): 0.5009715837
Skewness: 0.6386035865
Sum: 5437.069499
Variance: 0.3294564436
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.34657359 0.89587973 1.24245332 1.49786614 1.86883481 2.01267585 2.24990484 2.35024018 3.25809654], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 4607 46.1%
 
0.6931471806 2684 26.8%
 
1.098612289 1431 14.3%
 
1.386294361 731 7.3%
 
1.609437912 227 2.3%
 
1.791759469 152 1.5%
 
1.945910149 76 0.8%
 
2.079441542 42 0.4%
 
2.197224577 27 0.3%
 
2.302585093 10 0.1%
 
Other values (10) 13 0.1%
 
Value Count Frequency (%)
0 4607 46.1%
 
0.6931471806 2684 26.8%
 
1.098612289 1431 14.3%
 
1.386294361 731 7.3%
 
1.609437912 227 2.3%
 
Value Count Frequency (%)
3.258096538 1 < 0.1%
 
3.218875825 1 < 0.1%
 
2.944438979 2 < 0.1%
 
2.890371758 1 < 0.1%
 
2.833213344 1 < 0.1%
 

mths_since_last_delinq
Real number (ℝ≥0)

ZEROS
Distinct count: 91
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.211956373
Minimum: 0
Maximum: 4.795790546
Zeros: 6479
Zeros (%): 64.8%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 3.17805383
95-th percentile: 4.189654742
Maximum: 4.795790546
Range: 4.795790546
Interquartile range (IQR): 3.17805383

Descriptive statistics

Standard deviation: 1.699849176
Coefficient of variation (CV): 1.402566309
Kurtosis: -1.224158213
Mean: 1.211956373
Median Absolute Deviation (MAD): 1.571733034
Skewness: 0.7866357636
Sum: 12119.56373
Variance: 2.889487221
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.34657359 0.89587973 1.24245332 1.86883481 ... 2.86179255 3.15677402 3.51143404 4.4248287 4.79579055], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 6479 64.8%
 
3.433987204 69 0.7%
 
3.555348061 66 0.7%
 
3.663561646 65 0.7%
 
3.17805383 65 0.7%
 
3.218875825 64 0.6%
 
3.80666249 64 0.6%
 
3.526360525 63 0.6%
 
3.044522438 63 0.6%
 
2.944438979 61 0.6%
 
Other values (81) 2941 29.4%
 
Value Count Frequency (%)
0 6479 64.8%
 
0.6931471806 6 0.1%
 
1.098612289 29 0.3%
 
1.386294361 40 0.4%
 
1.609437912 37 0.4%
 
Value Count Frequency (%)
4.795790546 1 < 0.1%
 
4.753590191 1 < 0.1%
 
4.584967479 1 < 0.1%
 
4.574710979 1 < 0.1%
 
4.564348191 1 < 0.1%
 

mths_since_last_record
Real number (ℝ≥0)

ZEROS
Distinct count: 94
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.641291617
Minimum: 0
Maximum: 4.787491743
Zeros: 267
Zeros (%): 2.7%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.521788577
Q1: 4.787491743
median: 4.787491743
Q3: 4.787491743
95-th percentile: 4.787491743
Maximum: 4.787491743
Range: 4.787491743
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7769844026
Coefficient of variation (CV): 0.1674069347
Kurtosis: 31.06395734
Mean: 4.641291617
Median Absolute Deviation (MAD): 0.2707703644
Skewness: -5.708959044
Sum: 46412.91617
Variance: 0.6037047619
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.97295507 2.9674471 3.63723978 3.99815862 4.43659395 4.46012771 4.78330762 4.78749174], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
4.787491743 9163 91.6%
 
0 267 2.7%
 
4.49980967 21 0.2%
 
4.762173935 18 0.2%
 
4.465908119 17 0.2%
 
4.477336814 17 0.2%
 
4.532599493 17 0.2%
 
4.744932128 16 0.2%
 
4.615120517 16 0.2%
 
4.65396035 16 0.2%
 
Other values (84) 432 4.3%
 
Value Count Frequency (%)
0 267 2.7%
 
1.945910149 1 < 0.1%
 
2.48490665 1 < 0.1%
 
2.890371758 1 < 0.1%
 
3.044522438 2 < 0.1%
 
Value Count Frequency (%)
4.787491743 9163 91.6%
 
4.779123493 11 0.1%
 
4.770684624 10 0.1%
 
4.762173935 18 0.2%
 
4.753590191 10 0.1%
 

open_acc
Real number (ℝ≥0)

Distinct count: 36
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.241761756
Minimum: 0.6931471806
Maximum: 3.688879454
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0.6931471806
5-th percentile: 1.386294361
Q1: 1.945910149
median: 2.302585093
Q3: 2.564949357
95-th percentile: 2.944438979
Maximum: 3.688879454
Range: 2.995732274
Interquartile range (IQR): 0.6190392084

Descriptive statistics

Standard deviation: 0.439900357
Coefficient of variation (CV): 0.1962297536
Kurtosis: -0.07499054695
Mean: 2.241761756
Median Absolute Deviation (MAD): 0.3524184816
Skewness: -0.2090961395
Sum: 22417.61756
Variance: 0.1935123241
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.69314718 0.89587973 1.24245332 1.49786614 1.70059869 ... 3.02012736 3.11326833 3.2769667 3.44986155 3.68887945], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
2.079441542 1035 10.3%
 
1.945910149 990 9.9%
 
2.197224577 937 9.4%
 
2.302585093 934 9.3%
 
2.397895273 805 8.1%
 
1.791759469 763 7.6%
 
2.48490665 692 6.9%
 
1.609437912 631 6.3%
 
2.564949357 577 5.8%
 
2.63905733 487 4.9%
 
Other values (26) 2149 21.5%
 
Value Count Frequency (%)
0.6931471806 7 0.1%
 
1.098612289 163 1.6%
 
1.386294361 374 3.7%
 
1.609437912 631 6.3%
 
1.791759469 763 7.6%
 
Value Count Frequency (%)
3.688879454 1 < 0.1%
 
3.610917913 2 < 0.1%
 
3.583518938 1 < 0.1%
 
3.526360525 3 < 0.1%
 
3.496507561 1 < 0.1%
 

pub_rec
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
0 9427 94.3%
 
0.6931471806 550 5.5%
 
1.098612289 18 0.2%
 
1.386294361 5 0.1%
 

Length

Max length: 18
Mean length: 3.8595
Min length: 3

Value Count Frequency (%)
Decimal_Number 10 90.9%
Other_Punctuation 1 9.1%

Value Count Frequency (%)
Common 11 100.0%

Value Count Frequency (%)
ASCII 11 100.0%
 

revol_bal
Real number (ℝ≥0)

ZEROS
Distinct count: 8130
Unique (%): 81.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.665349596
Minimum: 0
Maximum: 14.00394672
Zeros: 278
Zeros (%): 2.8%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5.631032248
Q1: 8.167777519
median: 9.064909879
Q3: 9.738214835
95-th percentile: 10.70449851
Maximum: 14.00394672
Range: 14.00394672
Interquartile range (IQR): 1.570437316

Descriptive statistics

Standard deviation: 1.964604275
Coefficient of variation (CV): 0.2267195631
Kurtosis: 8.594573757
Mean: 8.665349596
Median Absolute Deviation (MAD): 1.250096946
Skewness: -2.538633482
Sum: 86653.49596
Variance: 3.859669956
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.34657359 2.44140096 4.74056529 5.47226185 ... 10.76851656 11.16428945 11.70020365 12.05977432 14.00394672], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 278 2.8%
 
7.708859601 6 0.1%
 
7.475339237 6 0.1%
 
6.634633358 5 0.1%
 
8.476787777 5 0.1%
 
9.361257257 5 0.1%
 
7.98480339 4 < 0.1%
 
5.693732139 4 < 0.1%
 
5.620400866 4 < 0.1%
 
5.96100534 4 < 0.1%
 
Other values (8120) 9679 96.8%
 
Value Count Frequency (%)
0 278 2.8%
 
0.6931471806 2 < 0.1%
 
1.386294361 2 < 0.1%
 
1.791759469 1 < 0.1%
 
1.945910149 2 < 0.1%
 
Value Count Frequency (%)
14.00394672 1 < 0.1%
 
13.30887614 1 < 0.1%
 
13.14012864 1 < 0.1%
 
13.09723017 1 < 0.1%
 
12.95557653 1 < 0.1%
 

revol_util
Real number (ℝ≥0)

ZEROS
Distinct count: 1027
Unique (%): 10.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.451419
Minimum: 0
Maximum: 100.6
Zeros: 254
Zeros (%): 2.5%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.8
Q1: 25
median: 48.7
Q3: 71.8
95-th percentile: 93.6
Maximum: 100.6
Range: 100.6
Interquartile range (IQR): 46.8

Descriptive statistics

Standard deviation: 28.18384582
Coefficient of variation (CV): 0.5816928874
Kurtosis: -1.094338897
Mean: 48.451419
Median Absolute Deviation (MAD): 24.0658605
Skewness: -0.01681450313
Sum: 484514.19
Variance: 794.3291651
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 1.500e-02 6.500e-02 1.100e-01 2.365e+01 ... 4.875e+01 8.555e+01 9.795e+01 9.995e+01 1.006e+02], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0 254 2.5%
 
48.7 34 0.3%
 
46.6 21 0.2%
 
43.4 20 0.2%
 
0.1 20 0.2%
 
55.4 19 0.2%
 
70 19 0.2%
 
47.6 19 0.2%
 
53.6 19 0.2%
 
56.8 19 0.2%
 
Other values (1017) 9556 95.6%
 
Value Count Frequency (%)
0 254 2.5%
 
0.03 1 < 0.1%
 
0.1 20 0.2%
 
0.12 1 < 0.1%
 
0.2 11 0.1%
 
Value Count Frequency (%)
100.6 1 < 0.1%
 
100 1 < 0.1%
 
99.9 4 < 0.1%
 
99.8 5 0.1%
 
99.7 3 < 0.1%
 

total_acc
Real number (ℝ≥0)

Distinct count: 75
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.997662856
Minimum: 0.6931471806
Maximum: 4.510859507
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0.6931471806
5-th percentile: 1.945910149
Q1: 2.63905733
median: 3.044522438
Q3: 3.401197382
95-th percentile: 3.80666249
Maximum: 4.510859507
Range: 3.817712326
Interquartile range (IQR): 0.762140052

Descriptive statistics

Standard deviation: 0.5501983497
Coefficient of variation (CV): 0.1835424383
Kurtosis: 0.007111528285
Mean: 2.997662856
Median Absolute Deviation (MAD): 0.4413015679
Skewness: -0.4832304342
Sum: 29976.62856
Variance: 0.302718224
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.69314718 1.24245332 1.49786614 1.70059869 1.86883481 ... 3.90192165 3.99815862 4.1510089 4.16663518 4.51085951], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
2.772588722 369 3.7%
 
3.044522438 365 3.6%
 
2.890371758 360 3.6%
 
2.564949357 357 3.6%
 
2.708050201 351 3.5%
 
2.995732274 346 3.5%
 
2.833213344 340 3.4%
 
2.944438979 339 3.4%
 
2.63905733 331 3.3%
 
3.135494216 329 3.3%
 
Other values (65) 6513 65.1%
 
Value Count Frequency (%)
0.6931471806 3 < 0.1%
 
1.098612289 10 0.1%
 
1.386294361 58 0.6%
 
1.609437912 115 1.1%
 
1.791759469 144 1.4%
 
Value Count Frequency (%)
4.510859507 1 < 0.1%
 
4.406719247 1 < 0.1%
 
4.394449155 1 < 0.1%
 
4.382026635 1 < 0.1%
 
4.369447852 1 < 0.1%
 

initial_list_status
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
f 9983 99.8%
 
m 17 0.2%
 

Length

Max length: 1
Mean length: 1
Min length: 1

Value Count Frequency (%)
Lowercase_Letter 2 100.0%

Value Count Frequency (%)
Latin 2 100.0%

Value Count Frequency (%)
ASCII 2 100.0%
 

mths_since_last_major_derog
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
1.098612289 3424 34.2%
 
1.386294361 3299 33.0%
 
0.6931471806 3277 32.8%
 

Length

Max length: 18
Mean length: 18
Min length: 18

Value Count Frequency (%)
Decimal_Number 10 90.9%
Other_Punctuation 1 9.1%

Value Count Frequency (%)
Common 11 100.0%

Value Count Frequency (%)
ASCII 11 100.0%
 

policy_code
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 78.2 KiB

Value Count Frequency (%)
pc3 2098 21.0%
 
pc5 2025 20.2%
 
pc1 1978 19.8%
 
pc2 1962 19.6%
 
pc4 1937 19.4%
 

Length

Max length: 3
Mean length: 3
Min length: 3

Value Count Frequency (%)
Decimal_Number 5 71.4%
Lowercase_Letter 2 28.6%

Value Count Frequency (%)
Common 5 71.4%
Latin 2 28.6%

Value Count Frequency (%)
ASCII 7 100.0%
 

cr_line_yrs
Real number (ℝ≥0)

Distinct count: 50
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.599902149
Minimum: 7.586296307
Maximum: 7.635303886
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 7.586296307
5-th percentile: 7.593374193
Q1: 7.598399329
median: 7.600402335
Q3: 7.60190196
95-th percentile: 7.604396349
Maximum: 7.635303886
Range: 0.04900757911
Interquartile range (IQR): 0.003502630551

Descriptive statistics

Standard deviation: 0.003860579862
Coefficient of variation (CV): 0.0005079775747
Kurtosis: 20.53110444
Mean: 7.599902149
Median Absolute Deviation (MAD): 0.002623110056
Skewness: 1.673860451
Sum: 75999.02149
Variance: 1.490407687e-05
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[7.58629631 7.58908285 7.59211392 7.59312224 7.59463281 ... 7.60514342 7.61573756 7.63118889 7.6340954 7.63530389], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
7.601402335 839 8.4%
 
7.600402335 753 7.5%
 
7.60090246 715 7.1%
 
7.60190196 642 6.4%
 
7.599901959 601 6.0%
 
7.599401333 592 5.9%
 
7.598900457 518 5.2%
 
7.598399329 513 5.1%
 
7.602401336 503 5.0%
 
7.602900462 455 4.5%
 
Other values (40) 3869 38.7%
 
Value Count Frequency (%)
7.586296307 14 0.1%
 
7.586803535 11 0.1%
 
7.587310506 13 0.1%
 
7.58781722 18 0.2%
 
7.588323677 14 0.1%
 
Value Count Frequency (%)
7.635303886 9 0.1%
 
7.634820678 7 0.1%
 
7.634337236 6 0.1%
 
7.63385356 2 < 0.1%
 
7.63336965 2 < 0.1%
 

cr_line_mths
Real number (ℝ≥0)

Distinct count: 12
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.922837082
Minimum: 0.6931471806
Maximum: 2.564949357
Zeros: 0
Zeros (%): 0.0%
Memory size: 78.2 KiB

Quantile statistics

Minimum: 0.6931471806
5-th percentile: 0.6931471806
Q1: 1.609437912
median: 2.079441542
Q3: 2.397895273
95-th percentile: 2.564949357
Maximum: 2.564949357
Range: 1.871802177
Interquartile range (IQR): 0.7884573604

Descriptive statistics

Standard deviation: 0.5751904668
Coefficient of variation (CV): 0.2991363502
Kurtosis: -0.4315473294
Mean: 1.922837082
Median Absolute Deviation (MAD): 0.4787928656
Skewness: -0.8413490009
Sum: 19228.37082
Variance: 0.3308440731
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.69314718 0.89587973 1.24245332 1.49786614 1.70059869 ... 2.13833306 2.24990484 2.35024018 2.524928 2.56494936], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
2.397895273 1057 10.6%
 
2.48490665 999 10.0%
 
2.564949357 972 9.7%
 
2.302585093 923 9.2%
 
0.6931471806 904 9.0%
 
2.197224577 794 7.9%
 
2.079441542 771 7.7%
 
1.945910149 740 7.4%
 
1.791759469 740 7.4%
 
1.098612289 728 7.3%
 
Other values (2) 1372 13.7%
 
Value Count Frequency (%)
0.6931471806 904 9.0%
 
1.098612289 728 7.3%
 
1.386294361 696 7.0%
 
1.609437912 676 6.8%
 
1.791759469 740 7.4%
 
Value Count Frequency (%)
2.564949357 972 9.7%
 
2.48490665 999 10.0%
 
2.397895273 1057 10.6%
 
2.302585093 923 9.2%
 
2.197224577 794 7.9%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
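
The calculation described above can be sketched in a few lines of plain Python. `pearson_r` and the sample data are illustrative and are not taken from pandas-profiling or from the profiled dataset:

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of cross-products and squared deviations are enough.
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: y is an exact increasing linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]
r_linear = pearson_r(x, y)

# Changing the location and scale of x leaves r unchanged.
r_rescaled = pearson_r([10.0 * v - 3.0 for v in x], y)
print(r_linear, r_rescaled)
```

The function divides by zero when either variable is constant; r is undefined in that case.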

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
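
A minimal sketch of that procedure: rank both variables, then apply the Pearson formula to the ranks. Ties are given average ranks here, which is a common convention but an assumption as far as this report is concerned; all names and data are hypothetical:

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_rx, mean_ry = sum(rx) / n, sum(ry) / n
    sum_xy = sum((a - mean_rx) * (b - mean_ry) for a, b in zip(rx, ry))
    sum_xx = sum((a - mean_rx) ** 2 for a in rx)
    sum_yy = sum((b - mean_ry) ** 2 for b in ry)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: y = exp(x) is nonlinear but strictly increasing in x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(v) for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the strongly nonlinear relationship in the example still yields a perfect positive ρ.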

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
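
The pair-counting definition above translates directly into code. The sketch below implements exactly that formula (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently use a tie-adjusted variant instead, so results can differ when ties are present. Names and data are hypothetical:

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau as (concordant - discordant) / total number of pairs.

    Tied pairs count as neither concordant nor discordant ("tau-a").
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical data in which one adjacent pair is swapped.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)
print(tau)
```

Of the six pairs in the example, five are concordant and one is discordant, giving τ = 4/6.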

Missing values

Sample

First rows

index | is_bad | emp_length | home_ownership | annual_inc | verification_status | pymnt_plan | purpose_cat | addr_state | debt_to_income | delinq_2yrs | inq_last_6mths | mths_since_last_delinq | mths_since_last_record | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | mths_since_last_major_derog | policy_code | cr_line_yrs | cr_line_mths
0 | 0 | 2.397895 | mortgage | 10.819798 | not verified | n | medical | tx | 10.87 | 0.000000 | 0.000000 | 0.000000 | 4.787492 | 2.772589 | 0.0 | 9.399969 | 12.1 | 3.806662 | f | 0.693147 | pc4 | 7.597396 | 2.564949
1 | 0 | 0.693147 | rent | 10.576866 | not verified | n | debt consolidation | ks | 9.15 | 0.000000 | 1.098612 | 0.000000 | 4.787492 | 1.609438 | 0.0 | 9.221775 | 64.0 | 1.791759 | f | 1.098612 | pc1 | 7.603898 | 2.484907
2 | 0 | 1.609438 | rent | 11.082158 | not verified | n | credit card | ca | 11.24 | 0.000000 | 0.000000 | 0.000000 | 4.787492 | 1.609438 | 0.0 | 4.406719 | 0.6 | 2.197225 | f | 1.386294 | pc4 | 7.586296 | 1.945910
3 | 0 | 2.397895 | mortgage | 10.959558 | not verified | n | debt consolidation | ny | 6.18 | 0.693147 | 0.000000 | 2.833213 | 4.787492 | 1.945910 | 0.0 | 9.213436 | 37.1 | 3.178054 | f | 1.098612 | pc2 | 7.592366 | 2.302585
4 | 0 | 2.397895 | mortgage | 10.819878 | verified - income | n | debt consolidation | oh | 19.03 | 0.000000 | 1.609438 | 0.000000 | 4.787492 | 2.197225 | 0.0 | 9.281823 | 40.4 | 3.091042 | f | 1.386294 | pc3 | 7.600902 | 2.397895
5 | 0 | 1.609438 | rent | 10.758520 | verified - income | n | other | dc | 7.83 | 1.098612 | 0.693147 | 2.995732 | 4.787492 | 1.945910 | 0.0 | 7.447751 | 26.4 | 3.258097 | f | 1.386294 | pc3 | 7.600902 | 2.564949
6 | 0 | 2.397895 | mortgage | 11.744045 | not verified | n | credit card | ny | 14.28 | 0.000000 | 0.000000 | 0.000000 | 4.787492 | 2.944439 | 0.0 | 8.606485 | 11.1 | 3.401197 | f | 1.386294 | pc1 | 7.590852 | 2.484907
7 | 0 | 1.945910 | mortgage | 10.645449 | verified - income source | n | debt consolidation | nv | 10.29 | 0.000000 | 0.000000 | 0.000000 | 4.787492 | 2.302585 | 0.0 | 9.245225 | 95.9 | 2.397895 | f | 1.386294 | pc3 | 7.604396 | 1.609438
8 | 0 | 1.098612 | mortgage | 10.819798 | verified - income | n | debt consolidation | il | 15.36 | 0.000000 | 1.098612 | 0.000000 | 4.787492 | 2.484907 | 0.0 | 9.886494 | 59.2 | 3.332205 | f | 0.693147 | pc5 | 7.601902 | 1.098612
9 | 0 | 0.693147 | rent | 10.596660 | not verified | n | car | ca | 6.48 | 0.000000 | 0.693147 | 0.000000 | 4.787492 | 2.484907 | 0.0 | 9.903438 | 18.3 | 3.178054 | f | 0.693147 | pc5 | 7.598900 | 1.791759

Last rows

index | is_bad | emp_length | home_ownership | annual_inc | verification_status | pymnt_plan | purpose_cat | addr_state | debt_to_income | delinq_2yrs | inq_last_6mths | mths_since_last_delinq | mths_since_last_record | open_acc | pub_rec | revol_bal | revol_util | total_acc | initial_list_status | mths_since_last_major_derog | policy_code | cr_line_yrs | cr_line_mths
9990 | 0 | 2.397895 | mortgage | 11.695255 | verified - income | n | home improvement | mi | 14.44 | 0.693147 | 0.000000 | 1.609438 | 4.787492 | 2.708050 | 0.000000 | 9.596759 | 59.8 | 3.465736 | f | 1.098612 | pc2 | 7.598399 | 1.098612
9991 | 0 | 2.397895 | rent | 11.050906 | verified - income source | n | medical | ma | 10.08 | 0.000000 | 0.000000 | 0.000000 | 4.787492 | 1.945910 | 0.000000 | 4.110874 | 1.1 | 3.135494 | f | 1.386294 | pc1 | 7.595890 | 1.791759
9992 | 0 | 2.397895 | rent | 10.859018 | verified - income | n | debt consolidation | ny | 23.70 | 0.000000 | 0.000000 | 4.262680 | 4.787492 | 2.197225 | 0.000000 | 9.616005 | 91.5 | 2.944439 | f | 1.098612 | pc5 | 7.600402 | 2.197225
9993 | 0 | 2.397895 | own | 11.470988 | verified - income | n | home improvement | ny | 8.70 | 0.000000 | 1.098612 | 0.000000 | 4.787492 | 1.386294 | 0.000000 | 7.668561 | 30.6 | 2.079442 | f | 1.386294 | pc5 | 7.598900 | 2.079442
9994 | 1 | 0.693147 | rent | 10.126511 | verified - income source | n | debt consolidation | ca | 3.79 | 0.000000 | 0.000000 | 0.000000 | 4.787492 | 1.098612 | 0.000000 | 8.476788 | 56.5 | 2.079442 | f | 0.693147 | pc1 | 7.603898 | 2.197225
9995 | 0 | 1.791759 | mortgage | 11.101206 | verified - income | n | wedding | ma | 9.40 | 0.000000 | 0.693147 | 0.000000 | 4.787492 | 2.197225 | 0.000000 | 8.204398 | 24.1 | 2.397895 | f | 1.098612 | pc3 | 7.601902 | 2.302585
9996 | 0 | 0.693147 | rent | 10.165890 | verified - income source | n | debt consolidation | ny | 20.49 | 0.000000 | 0.693147 | 4.382027 | 4.787492 | 2.197225 | 0.000000 | 8.811354 | 58.9 | 2.564949 | f | 1.098612 | pc3 | 7.601402 | 1.791759
9997 | 0 | 2.197225 | rent | 10.775450 | not verified | n | debt consolidation | nj | 24.13 | 0.000000 | 0.000000 | 0.000000 | 4.718499 | 2.302585 | 0.693147 | 9.336709 | 60.7 | 2.890372 | f | 1.386294 | pc3 | 7.595890 | 2.564949
9998 | 0 | 1.945910 | mortgage | 11.156265 | not verified | n | major purchase | va | 16.18 | 1.098612 | 1.098612 | 2.833213 | 4.787492 | 2.302585 | 0.000000 | 9.750220 | 50.9 | 3.332205 | f | 1.098612 | pc3 | 7.600902 | 1.386294
9999 | 0 | 0.693147 | rent | 11.164233 | not verified | n | credit card | ca | 16.13 | 0.000000 | 0.693147 | 3.988984 | 4.787492 | 2.772589 | 0.000000 | 7.742836 | 22.6 | 3.555348 | f | 1.098612 | pc5 | 7.601402 | 2.302585